Information analysis apparatus and computer readable medium

ABSTRACT

An information analysis apparatus includes: a storage that stores data values while respectively correlating the data values with plural nodes; a first setting unit that, for the nodes, sets a first virtual link that is directed oppositely to a predetermined directed link; a second setting unit that adds a virtual node to the nodes, and that sets a second virtual link which is bidirectional between the added virtual node and each of the nodes; and an updating unit that updates data values respectively correlated with the nodes, on the basis of respective weights of predetermined links between the nodes, the first virtual link, and the second virtual link.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based on and claims priority under 35 U.S.C. 119 from Japanese Patent Application No. 2007-326563 filed Dec. 18, 2007.

BACKGROUND

1. Technical Field

The present invention relates to an information analysis apparatus and a computer readable medium.

2. Related Art

A citation network is configured by linking groups of scientific and technical literature, such as scientific papers and patents, to one another through “citation relationships”. When such a citation network is analyzed, it is possible to obtain useful information.

SUMMARY

According to an aspect of the present invention, an information analysis apparatus includes: a storage that stores data values while respectively correlating the data values with a plurality of nodes; a first setting unit that, for the nodes, sets a first virtual link that is directed oppositely to a predetermined directed link; a second setting unit that adds a virtual node to the nodes, and that sets a second virtual link which is bidirectional between the added virtual node and each of the nodes; and an updating unit that updates data values respectively correlated with the nodes, on the basis of respective weights of predetermined links between the nodes, the first virtual link, and the second virtual link.

BRIEF DESCRIPTION OF THE DRAWINGS

Exemplary embodiments of the present invention will be described in detail based on the following figures, wherein:

FIG. 1 is a functional block diagram of an information analysis apparatus of an embodiment;

FIGS. 2A and 2B are views showing a part of a citation network in which citation relationships among documents are configured as links; and

FIG. 3 is a flowchart illustrating the flow of a citation network analyzing process which is performed by the information analysis apparatus.

DETAILED DESCRIPTION

Hereinafter, exemplary embodiments (hereinafter, referred to as embodiments) to implement the invention will be described with reference to the drawings.

Example 1

FIG. 1 is a functional block diagram of an information analysis apparatus 10 of a first embodiment of the invention. As shown in FIG. 1, the information analysis apparatus 10 includes a data storage portion 20, a virtual-link setting portion 22, a virtual-node adding portion 24, a link weight calculating portion 26, a data processing portion 28, and a result outputting portion 30. The functions of the portions may be realized by operating the information analysis apparatus 10, which is a computer system, in accordance with computer programs. The computer programs may be stored in an information recording medium of any form which is readable by a computer, such as a CD-ROM, a DVD-ROM, or a flash memory, and read into the information analysis apparatus 10 by a medium reading apparatus (not shown) which is connected to the information analysis apparatus 10. Alternatively, the computer programs may be downloaded to the information analysis apparatus 10 through a network.

The data storage portion 20 stores information of plural objects (nodes) in which directional relationships and weights are previously defined. In the embodiment, the objects to be processed are documents which have citation relationships with respect to other documents, such as patent publications or scientific papers, and the directional relationships among the objects are expressed by citation relationships.

In the virtual-link setting portion 22, with respect to a document group stored in the data storage portion 20, a virtual link which is directed oppositely to a predetermined directed link is disposed, and each directed link is bidirectionalized. The virtual-link setting portion 22 may determine whether a directed link between nodes is bidirectional or not, and perform a process of disposing a virtual oppositely directed link on the basis of a result of the determination.

The virtual-node adding portion 24 adds a virtual node to a node group of a process object, and sets a virtual bidirectional link between the added node and each of the nodes to be processed.

The processes of the virtual-link setting portion 22 and the virtual-node adding portion 24 will be specifically described with reference to FIGS. 2A and 2B. FIG. 2A exemplarily shows a part of a citation network in which, in a document group configured by plural documents stored in the data storage portion 20, the documents are set as nodes, respectively, and citation relationships among the documents are configured as links.

In the citation network shown in FIG. 2A, links based on predetermined citation relationships among the nodes are shown. As shown in FIG. 2A, nodes a, b, and c have the illustrated directional relationships in which the node b cites the node a, and the node c cites the nodes a and b. The node a is a node in which there is no link (out-link) directed to another node. A node in which there is no out-link, such as the node a, and a node in which there is an out-link, such as the nodes b and c, cannot be handled in the same way in the regularization (which will be described later) of an adjacency matrix unless a countermeasure is taken. In the embodiment, therefore, the citation network is expanded as shown in FIG. 2B in order to enable all nodes to be handled in a unified manner.

Namely, in the embodiment, as shown in FIG. 2B, a virtual node (N+1) is added to the citation network, and directed links are virtually bidirectionalized. The links indicated by broken lines in FIG. 2B are virtual links, and the node indicated by a broken line is the virtual node (N+1) which is newly added. The virtual node (N+1) has bidirectional links with respect not only to the illustrated nodes a, b, and c, but also to all other nodes which are not illustrated. When the number of elements of a node group of a process object which is originally included in the citation network is N, the element number of the virtual node which is newly added is N+1.

Next, a calculation which is performed in the link weight calculating portion 26 for calculating the weights of the links will be described. In the embodiment, as information indicating the citation relationships among the nodes, a matrix (adjacency matrix) A indicating the citation network is defined in the following manner. In the case where the number of elements of a node group of a process object is N, the matrix A is defined as a matrix of N×N. The element numbers of documents may be allocated in the order in which the documents are produced. The component in row i and column j of the adjacency matrix A (1≦i≦N, 1≦j≦N) is indicated by A_(ij). Then, A_(ij) is defined by the following Expressions (1) to (3). Namely, when a document j cites a document i, the following is set:

A_(ij)=w  (1)

When not, the following is set:

A_(ij)=0  (2)

Hereinafter, the weight w of a link is uniformly defined as w=1. Each document does not cite itself, and hence

A_(ii)=0  (3)
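As a minimal sketch of Expressions (1) to (3), the adjacency matrix for the three documents of FIG. 2A could be assembled as follows. The function name `build_adjacency`, the 0-based indices (a=0, b=1, c=2), and the list-of-lists representation are illustrative assumptions and are not part of the embodiment.

```python
def build_adjacency(n, citations, w=1.0):
    """Return the n x n matrix A with A[i][j] = w when document j cites document i.

    `citations` is an iterable of (citing, cited) pairs using 0-based indices
    (a hypothetical input format chosen for this sketch).
    """
    A = [[0.0] * n for _ in range(n)]
    for citing, cited in citations:
        if citing == cited:
            continue  # Expression (3): a document does not cite itself
        A[cited][citing] = w  # Expression (1); untouched entries stay 0, Expression (2)
    return A


# FIG. 2A with a=0, b=1, c=2: b cites a, and c cites a and b.
A = build_adjacency(3, [(1, 0), (2, 0), (2, 1)])
```

In this layout, column j lists the documents cited by document j, and row i lists the documents that cite document i.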

When the above-described matrix A is used, the number (out-link number) k_(out)(j) of documents which are cited by the document j is indicated by following Expression (4):

$\begin{matrix}{{k_{out}(j)} = {\sum\limits_{i = 1}^{N}A_{ij}}} & (4)\end{matrix}$

The number (in-link number) k_(in)(j) of documents which cite the document j is indicated by following Expression (5):

$\begin{matrix}{{k_{i\; n}(j)} = {\sum\limits_{i = 1}^{N}A_{ij}}} & (5)\end{matrix}$

When the virtual node (N+1-th node) is added, k_(out)(j) and k_(in)(j) are corrected as shown in following Expressions (6) and (7). The virtual node is a node which has a bidirectional link with all of the nodes.

$\begin{matrix}{{{\overset{\sim}{k}}_{out}(j)} = {{k_{out}(j)} + E}} & (6) \\{{{\overset{\sim}{k}}_{out}\left( {N + 1} \right)} = {NF}} & (7)\end{matrix}$

In the above expressions, E is the weight of a link (out-link) of each node to the virtual node, and F is the weight of a link (in-link) of each node from the virtual node. When the matrix A is regularized, the matrix A is indicated by following Expressions (8) to (11). The regularization means an operation of causing the total sum of weights of cited documents to be equal to a predetermined value m, i.e., that of correcting the out-link number to m. This operation is performed on an arbitrary document j. Actually, individual documents cite various numbers of other documents. The above operation corresponds to regularizing that number to m. This means that, in the case where the ranking of documents is calculated using a dynamic technique such as the spreading activation, the continuous fixed point attractor dynamics, or the virtual random walk, the ranking of the documents is determined with the number of times each document is cited as a main factor, in place of the number of documents cited by each document (citing a larger number does not mean that the value is higher, and citing a smaller number does not mean that the value is lower).

$\begin{matrix}{{\overset{\sim}{A}}_{ij} = {\frac{m}{{\overset{\sim}{k}}_{out}(j)}A_{ij}}} & (8) \\{{\overset{\sim}{A}}_{{N + 1},j} = {\frac{m}{{\overset{\sim}{k}}_{out}(j)}E}} & (9) \\{{\overset{\sim}{A}}_{i,{N + 1}} = {{\frac{m}{{\overset{\sim}{k}}_{out}\left( {N + 1} \right)}F} = \frac{m}{N}}} & (10) \\{{\overset{\sim}{A}}_{{N + 1},{N + 1}} = 0} & (11)\end{matrix}$

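In the above and following expressions, the lowercase indices i and j range over the original nodes (1≦i≦N, 1≦j≦N), and the uppercase indices I and J range over all nodes including the virtual node (1≦I≦N+1, 1≦J≦N+1).

In each node, the total of the weights of out-links and in-links is indicated by following Expressions (12) to (15):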

$\begin{matrix}\begin{matrix}{{\sum\limits_{{I = 1},{I \neq j}}^{N + 1}{\overset{\sim}{A}}_{Ij}} = {{\sum\limits_{{i = 1},{i \neq j}}^{N}{\overset{\sim}{A}}_{ij}} + {\overset{\sim}{A}}_{{N + 1},j}}} \\{= {{\sum\limits_{{i = 1},{i \neq j}}^{N}{\frac{m}{{\overset{\sim}{k}}_{out}(j)}A_{ij}}} + {\frac{m}{{\overset{\sim}{k}}_{out}(j)}E}}} \\{= {{\frac{m}{{\overset{\sim}{k}}_{out}(j)}{\sum\limits_{{i = 1},{i \neq j}}^{N}A_{ij}}} + {\frac{m}{{\overset{\sim}{k}}_{out}(j)}E}}} \\{= \frac{m\left( {{k_{out}(j)} + E} \right)}{{\overset{\sim}{k}}_{out}(j)}} \\{= m}\end{matrix} & (12) \\{{\sum\limits_{{I = 1},{I \neq {N + 1}}}^{N + 1}{\overset{\sim}{A}}_{I,{N + 1}}} = {{\sum\limits_{i = 1}^{N}{\overset{\sim}{A}}_{i,{N + 1}}} = {{\sum\limits_{i = 1}^{N}\frac{m}{N}} = m}}} & (13) \\\begin{matrix}{{\sum\limits_{{I = 1},{I \neq j}}^{N + 1}{\overset{\sim}{A}}_{jI}} = {{\sum\limits_{{i = 1},{i \neq j}}^{N}{\overset{\sim}{A}}_{ji}} + {\overset{\sim}{A}}_{j,{N + 1}}}} \\{= {{\sum\limits_{{i = 1},{i \neq j}}^{N}{\frac{m}{{\overset{\sim}{k}}_{out}(i)}A_{ji}}} + \frac{m}{N}}} \\{= {m\left( {{\sum\limits_{{i = 1},{i \neq j}}^{N}\frac{A_{ji}}{{\overset{\sim}{k}}_{out}(i)}} + \frac{1}{N}} \right)}} \\{= {m\left( {{\overset{\sim}{\kappa}(j)} + \frac{1}{N}} \right)}}\end{matrix} & (14) \\\begin{matrix}{{\sum\limits_{{I = 1},{I \neq {N + 1}}}^{N + 1}{\overset{\sim}{A}}_{{N + 1},I}} = {\sum\limits_{i = 1}^{N}{\overset{\sim}{A}}_{{N + 1},i}}} \\{= {\sum\limits_{i = 1}^{N}{\frac{m}{{\overset{\sim}{k}}_{out}(i)}E}}} \\{= {{mE}{\sum\limits_{i = 1}^{N}\frac{1}{{\overset{\sim}{k}}_{out}(i)}}}} \\{= {{mE}\overset{\sim}{K}}}\end{matrix} & (15)\end{matrix}$

In the above expressions,

$\begin{matrix}{{\overset{\sim}{\kappa}(j)} = {\sum\limits_{{i = 1},{i \neq j}}^{N}\frac{A_{ji}}{{\overset{\sim}{k}}_{out}(i)}}} & (16) \\{\overset{\sim}{K} = {\sum\limits_{i = 1}^{N}\frac{1}{{\overset{\sim}{k}}_{out}(i)}}} & (17)\end{matrix}$

With respect to a unidirectional link, a virtual oppositely directed link is set, so that the link is bidirectionalized. Namely, an adjacency matrix defining a bidirectional link is expressed by following Expression (18):

$\begin{matrix}\left\{ \begin{matrix}{{\overset{\_}{A}}_{IJ} = {{\overset{\sim}{A}}_{IJ} + {\overset{\sim}{A}}_{JI}}} & \left( {I \neq J} \right) \\{{\overset{\_}{A}}_{II} = 0} & \;\end{matrix} \right. & (18)\end{matrix}$

A bidirectionalized adjacency matrix is specifically expressed by following Expressions (19) to (22):

$\begin{matrix}{{\overset{\_}{A}}_{ij} = {{\frac{m}{{\overset{\sim}{k}}_{out}(j)}A_{ij}} + {\frac{m}{{\overset{\sim}{k}}_{out}(i)}A_{ji}}}} & (19) \\{{\overset{\_}{A}}_{{N + 1},j} = {{\frac{m}{{\overset{\sim}{k}}_{out}(j)}E} + \frac{m}{N}}} & (20) \\{{\overset{\_}{A}}_{i,{N + 1}} = {\frac{m}{N} + {\frac{m}{{\overset{\sim}{k}}_{out}(i)}E}}} & (21) \\{{\overset{\_}{A}}_{{N + 1},{N + 1}} = 0} & (22)\end{matrix}$

In the above,

Ā_(IJ)

is a symmetric matrix. Therefore, the following is attained:

Ā_(IJ)=Ā_(JI)  (23)
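The bidirectionalizing step amounts to adding a matrix to its transpose while keeping the diagonal at zero. A minimal sketch follows; the function name `bidirectionalize` and the sample weights in `M` are made up for illustration and do not come from the embodiment.

```python
def bidirectionalize(M):
    """Add each directed weight to its opposite direction; keep the diagonal at 0."""
    n = len(M)
    return [[0.0 if I == J else M[I][J] + M[J][I] for J in range(n)]
            for I in range(n)]


# Arbitrary directed weights (exactly representable in binary floating point).
M = [[0.0, 0.5, 0.25],
     [0.0, 0.0, 0.5],
     [0.125, 0.0, 0.0]]
Ab = bidirectionalize(M)
is_symmetric = all(Ab[I][J] == Ab[J][I] for I in range(3) for J in range(3))
```

The result is symmetric by construction, which corresponds to Expression (23).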

and k(j) and k(N+1) are expressed by following Expressions (24) and (25), respectively:

$\begin{matrix}\begin{matrix}{{k(j)} = {\sum\limits_{{I = 1},{I \neq j}}^{N + 1}{\overset{\_}{A}}_{Ij}}} \\{= {{\sum\limits_{{I = 1},{i \neq j}}^{N + 1}{\overset{\sim}{A}}_{Ij}} + {\sum\limits_{{I = 1},{i \neq j}}^{N + 1}{\overset{\sim}{A}}_{jI}}}} \\{= {m + {m\left( {{\overset{\sim}{\kappa}(j)} + \frac{1}{N}} \right)}}}\end{matrix} & (24) \\\begin{matrix}{{k\left( {N + 1} \right)} = {\sum\limits_{{I = 1},{I \neq {N + 1}}}^{N + 1}{\overset{\_}{A}}_{{IN} + 1}}} \\{= {{\sum\limits_{{I = 1},{I \neq {N + 1}}}^{N + 1}{\overset{\sim}{A}}_{{IN} + 1}} + {\sum\limits_{{I = 1},{I \neq {+ 1}}}^{N + 1}{\overset{\sim}{A}}_{N + {1I}}}}} \\{= {m\left( {1 + {E\; \overset{\sim}{Κ}}} \right)}}\end{matrix} & (25)\end{matrix}$

Next, the bidirectioned adjacency matrix

Ā_(IJ)

is normalized by k(J), and a transition probability matrix T_(IJ) in which the weight of a link is expressed by a transition probability between nodes is produced. The transition probability matrix T_(IJ) is expressed by following Expression (26):

$\begin{matrix}\left\{ \begin{matrix}{T_{IJ} = {\frac{{\overset{\_}{A}}_{IJ}}{k(J)}\mspace{20mu} \left( {I \neq J} \right)}} \\{T_{II} = 0}\end{matrix} \right. & (26)\end{matrix}$

Specifically, T_(IJ) is expressed by following Expressions (27) to (30):

$\begin{matrix}\begin{matrix}{T_{ij} = {\frac{1}{k(j)}{\overset{\_}{A}}_{ij}}} \\{= {\frac{1}{m + {m\left( {{\overset{\sim}{\kappa}(j)} + \frac{1}{N}} \right)}}\left( {{\frac{m}{{\overset{\sim}{k}}_{out}(j)}A_{ij}} + {\frac{m}{{\overset{\sim}{k}}_{out}(i)}A_{ji}}} \right)}} \\{= {\frac{1}{1 + {\overset{\sim}{\kappa}(j)} + \frac{1}{N}}\left( {{\frac{1}{{\overset{\sim}{k}}_{out}(j)}A_{ij}} + {\frac{1}{{\overset{\sim}{k}}_{out}(i)}A_{ji}}} \right)}}\end{matrix} & (27) \\\begin{matrix}{T_{{N + 1},j} = {\frac{1}{k(j)}{\overset{\_}{A}}_{{N + 1},j}}} \\{= {\frac{1}{m + {m\left( {{\overset{\sim}{\kappa}(j)} + \frac{1}{N}} \right)}}\left( {\frac{mE}{{\overset{\sim}{k}}_{out}(j)} + \frac{m}{N}} \right)}} \\{= {\frac{1}{1 + {\overset{\sim}{\kappa}(j)} + \frac{1}{N}}\left( {\frac{E}{{\overset{\sim}{k}}_{out}(j)} + \frac{1}{N}} \right)}}\end{matrix} & (28) \\\begin{matrix}{T_{i,{N + 1}} = {\frac{1}{k\left( {N + 1} \right)}{\overset{\_}{A}}_{i,{N + 1}}}} \\{= {\frac{1}{m\left( {1 + {E\; \overset{\sim}{K}}} \right)}\left( {\frac{m}{N} + \frac{mE}{{\overset{\sim}{k}}_{out}(i)}} \right)}} \\{= {\frac{1}{\left( {1 + {E\; \overset{\sim}{K}}} \right)}\left( {\frac{1}{N} + \frac{E}{{\overset{\sim}{k}}_{out}(i)}} \right)}}\end{matrix} & (29) \\{T_{{N + 1},{N + 1}} = 0} & (30)\end{matrix}$

As is apparent also from Expressions (27) to (30) above, the transition probability matrix T_(IJ) does not depend on the value of m which is set as the total sum of out-links per node.
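The final forms of Expressions (27) to (30) can be evaluated directly, without m ever appearing. The sketch below is one possible reading of those expressions; the function name `transition_matrix_closed_form`, the 0-based indexing with the virtual node at index N, and the choice E=1 are assumptions made for illustration.

```python
def transition_matrix_closed_form(A, E=1.0):
    """Evaluate the last lines of Expressions (27) to (30); m does not appear."""
    N = len(A)
    k_tilde = [sum(A[i][j] for i in range(N)) + E for j in range(N)]  # (6)
    kappa = [sum(A[j][i] / k_tilde[i] for i in range(N) if i != j)
             for j in range(N)]                                        # (16)
    K = sum(1.0 / k for k in k_tilde)                                  # (17)
    T = [[0.0] * (N + 1) for _ in range(N + 1)]
    for j in range(N):
        denom = 1.0 + kappa[j] + 1.0 / N
        for i in range(N):
            if i != j:
                # Expression (27): links between original nodes.
                T[i][j] = (A[i][j] / k_tilde[j] + A[j][i] / k_tilde[i]) / denom
        T[N][j] = (E / k_tilde[j] + 1.0 / N) / denom          # Expression (28)
        T[j][N] = (1.0 / N + E / k_tilde[j]) / (1.0 + E * K)  # Expression (29)
    return T                                                  # T[N][N] = 0, (30)


A = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]  # FIG. 2A with a=0, b=1, c=2
T = transition_matrix_closed_form(A)
col_sums = [sum(T[I][J] for I in range(len(T))) for J in range(len(T))]
```

Each column of T sums to 1, so column J can be read as the transition probabilities out of node J.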

The link weight calculating portion 26 stores the transition probability matrix T_(IJ), in which the weights of the links calculated by the above-described process are stored, into the data storage portion 20.

The data processing portion 28 repeats the calculation of evaluation values of the nodes until predetermined termination conditions are satisfied, on the basis of evaluation values (ranks) of documents stored in the data storage portion 20 and the transition probability matrix T_(IJ) indicating the weights of the links between nodes, in accordance with a predetermined algorithm (for example, the PageRank algorithm, the spreading activation, or the continuous fixed point attractor dynamics). A dynamic technique is disclosed in, for example, JP-A-2006-133844, JP-A-2006-243804, and JP-A-2006-060124. In the case of the PageRank algorithm, for example, the predetermined termination condition may be whether or not a predetermined equilibrium state is reached, such as a state in which the total sum of evaluation values “flowing into” the nodes through the links is equal to that of evaluation values “flowing out” from the nodes through the links.
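The repeated update can be illustrated with a plain power iteration, used here only as a simple stand-in for the algorithms named above and not as the method of the cited publications. The function name `iterate_to_equilibrium`, the tolerance, and the small column-normalized matrix `T` are all assumptions made for this sketch.

```python
def iterate_to_equilibrium(T, x0, tol=1e-12, max_iter=10000):
    """Repeat x <- T x until the total change between two updates is below tol."""
    n = len(T)
    x = list(x0)
    for _ in range(max_iter):
        # Value flowing into node i from every node j along the link j -> i.
        x_new = [sum(T[i][j] * x[j] for j in range(n)) for i in range(n)]
        if sum(abs(p - q) for p, q in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x


# Hypothetical transition matrix: every column sums to 1, the diagonal is 0.
T = [[0.0, 0.5, 0.25],
     [0.5, 0.0, 0.75],
     [0.5, 0.5, 0.0]]
x = iterate_to_equilibrium(T, [1.0 / 3] * 3)
# Residual of the equilibrium condition x = T x.
residual = sum(abs(sum(T[i][j] * x[j] for j in range(3)) - x[i]) for i in range(3))
```

At equilibrium one further update leaves the evaluation values unchanged, which is the inflow-equals-outflow condition described above.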

The result outputting portion 30 outputs a process result on the basis of the evaluation values of the nodes which are calculated by the data processing portion 28. The process result may be output in the form of a list in which the evaluation values are arranged in descending order, or in a graph structure in which the higher the evaluation value of a node is, the larger the node is drawn. The result outputting portion 30 may display the obtained process result on a display device connected to the information analysis apparatus 10, or output the process result by printing.
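The list form of the output can be sketched in a few lines. The function name `ranked_list` and the sample evaluation values are hypothetical.

```python
def ranked_list(labels, values):
    """Pair each document label with its evaluation value, highest value first."""
    return sorted(zip(labels, values), key=lambda pair: pair[1], reverse=True)


result = ranked_list(["a", "b", "c"], [0.2, 0.5, 0.3])
```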

Next, the flow of the process of ranking documents constituting a citation network, which is conducted by the information analysis apparatus 10 of the embodiment, will be described with reference to FIG. 3.

For a citation network in which plural documents are set as respective nodes and citation relationships among the documents are expressed by directed links, the information analysis apparatus 10 produces an adjacency matrix on the basis of relationships which are previously set among the nodes (S101). Next, the information analysis apparatus 10 adds a virtual node to the citation network, and sets bidirectional links between the nodes and the virtual node to expand the adjacency matrix (S102). With respect to unidirectional ones of the directed links which are previously set among the nodes, the information analysis apparatus 10 virtually sets links which are paired with the unidirectional links, and which are opposite in direction to the unidirectional links, whereby the links are bidirectionalized to correct the adjacency matrix (S103).

The information analysis apparatus 10 normalizes the corrected adjacency matrix on the basis of the link number of each node, to produce a transition probability matrix (S104). The information analysis apparatus 10 updates the evaluation values of the documents on the basis of the evaluation values (ranks) correlated with the respective documents and the produced transition probability matrix (S105). The updating of the evaluation values is repeated until the relationships of the evaluation values of the documents reach predetermined termination conditions (equilibrium state) according to the algorithm (S106). As the algorithm, a known algorithm such as the PageRank algorithm, the spreading activation, or the continuous fixed point attractor dynamics may be used.

When it is determined that the evaluation values of the documents satisfy the predetermined termination conditions (S106: Y), the information analysis apparatus 10 produces a graph structure in which the sizes of nodes are changed on the basis of the evaluation values determined for the documents, for example, in accordance with the sizes of the evaluation values, and displays the graph on the display device (S107).
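Steps S101 to S107 can be strung together as a single sketch. It rests on several assumptions that are not taken from the embodiment: the function name `rank_citation_network`, a plain power iteration for S105 and S106, a uniform starting vector, E=1, and a sorted list in place of the graph display of S107. The virtual node is stored at index N and is left out of the reported result.

```python
def rank_citation_network(A, labels, m=1.0, E=1.0, tol=1e-12, max_iter=10000):
    N = len(A)
    # S102: add the virtual node (index N) and regularize, Expressions (6) to (11).
    k_tilde = [sum(A[i][j] for i in range(N)) + E for j in range(N)]
    At = [[0.0] * (N + 1) for _ in range(N + 1)]
    for j in range(N):
        for i in range(N):
            At[i][j] = m / k_tilde[j] * A[i][j]
        At[N][j] = m / k_tilde[j] * E
        At[j][N] = m / N
    # S103: bidirectionalize the links.
    Ab = [[0.0 if I == J else At[I][J] + At[J][I] for J in range(N + 1)]
          for I in range(N + 1)]
    # S104: normalize each column by its link total to get transition probabilities.
    k = [sum(Ab[I][J] for I in range(N + 1)) for J in range(N + 1)]
    T = [[Ab[I][J] / k[J] for J in range(N + 1)] for I in range(N + 1)]
    # S105 and S106: update until the evaluation values stop changing.
    x = [1.0 / (N + 1)] * (N + 1)
    for _ in range(max_iter):
        x_new = [sum(T[I][J] * x[J] for J in range(N + 1)) for I in range(N + 1)]
        done = sum(abs(p - q) for p, q in zip(x_new, x)) < tol
        x = x_new
        if done:
            break
    # S107 (simplified): report the original documents in descending order.
    return sorted(zip(labels, x[:N]), key=lambda pair: pair[1], reverse=True)


A = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]  # S101: FIG. 2A with a=0, b=1, c=2
ranking_m1 = rank_citation_network(A, ["a", "b", "c"], m=1.0)
ranking_m7 = rank_citation_network(A, ["a", "b", "c"], m=7.0)
```

For this small network the order comes out as a, b, c for both values of m, in line with the statement that the result does not depend on m.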

According to the above-described information analysis apparatus 10 of the embodiment, nodes having an out-link and those having no out-link can be handled by a unified technique, and the dependency of an analysis result on the value of m, which is set as the total sum of out-links per node of a node group of the calculation object, can be eliminated. Therefore, a more reliable ranking result can be obtained.

Next, other embodiments of the invention will be described.

Example 2

The information analysis apparatus 10 of the second embodiment is different from the above-described information analysis apparatus 10 of the first embodiment in the following points. In the information analysis apparatus 10 of the first embodiment, a technique has been proposed in which a virtual node is added and nodes having no out-link are eliminated so that all nodes are handled in a unified manner, and which does not depend on the total sum m of out-links. In the second embodiment, the dependency on the total sum m of out-links is eliminated, but nodes having no out-link are handled differently from those having an out-link.

In the information analysis apparatus 10 of the second embodiment, directed links are bidirectionalized without adding a virtual node, and then the link weight calculating portion 26 calculates the weights of the links in the following manner. First, the link weight calculating portion 26 regularizes the adjacency matrix A_(ij) as shown in following Expression (31), that is, in different manners according to whether a node has an out-link or not.

$\begin{matrix}{{\overset{\sim}{A}}_{ij} = \left\{ \begin{matrix}{\frac{m}{k_{out}(j)}A_{ij}} & \left( {{k_{out}(j)} \neq 0} \right) \\0 & \left( {{k_{out}(j)} = 0} \right)\end{matrix} \right.} & (31)\end{matrix}$

Each node does not cite itself, and hence the following is obtained:

Ã_(ii)=0  (32)

Here, the adjacency matrix is bidirectionalized as shown in following Expressions (33) and (34):

Ā_(ij)=Ã_(ij)+Ã_(ji)  (33)

Ā_(ii)=0  (34)

At this time, the link number k(j) of each node is expressed by following Expression (35) depending on whether the node has an out-link or not:

$\begin{matrix}\begin{matrix}{{k(j)} = {\sum\limits_{i = 1}^{N}{\overset{\_}{A}}_{ij}}} \\{= {{\sum\limits_{i = 1}^{N}{\overset{\sim}{A}}_{ij}} + {\sum\limits_{i = 1}^{N}{\overset{\sim}{A}}_{ji}}}} \\{= \left\{ \begin{matrix}{{m + {m{\sum\limits_{i = 1}^{N}\frac{A_{ji}}{k_{out}(i)}}}} = {m\left( {1 + {\kappa (j)}} \right)}} & \left( {{k_{out}(j)} \neq 0} \right) \\{{m{\sum\limits_{i = 1}^{N}\frac{A_{ji}}{k_{out}(i)}}} = {m\; {\kappa (j)}}} & \left( {{k_{out}(j)} = 0} \right)\end{matrix} \right.}\end{matrix} & (35)\end{matrix}$

where

$\begin{matrix}{{\kappa (j)} = {\sum\limits_{i = 1}^{N}\frac{A_{ji}}{k_{out}(i)}}} & (36)\end{matrix}$

Here, the bidirectioned adjacency matrix

Ā_(ij)

is normalized by k(j), and a transition probability matrix T_(ij) is obtained. In the case where j cites i,

k_(out)(j)≠0  (37)

and hence the following is obtained:

$\begin{matrix}\begin{matrix}{T_{ij} = {\frac{1}{k(j)}{\overset{\_}{A}}_{ij}}} \\{= {\frac{1}{k(j)}{\overset{\sim}{A}}_{ij}}} \\{= {\frac{1}{m\left( {1 + {\kappa (j)}} \right)}\frac{m}{k_{out}(j)}}} \\{= \frac{1}{\left( {1 + {\kappa (j)}} \right){k_{out}(j)}}}\end{matrix} & (38)\end{matrix}$

In the case where i cites j, the following is obtained:

$\begin{matrix}\begin{matrix}{T_{ij} = {\frac{1}{k(j)}{\overset{\_}{A}}_{ij}}} \\{= {\frac{1}{k(j)}{\overset{\sim}{A}}_{ji}}} \\{= \left\{ \begin{matrix}{{\frac{1}{m\left( {1 + {\kappa (j)}} \right)}\frac{m}{k_{out}(i)}} = \frac{1}{\left( {1 + {\kappa (j)}} \right){k_{out}(i)}}} & \left( {{k_{out}(j)} \neq 0} \right) \\{{\frac{1}{m\; {\kappa (j)}}\frac{m}{k_{out}(i)}} = \frac{1}{{\kappa (j)}{k_{out}(i)}}} & \left( {{k_{out}(j)} = 0} \right)\end{matrix} \right.}\end{matrix} & (39)\end{matrix}$

As is apparent also from Expressions (38) and (39) above, the transition probability matrix T_(ij) does not depend on the value of m which is set as the total sum of out-links per node. When the transition probability matrix T_(ij) is used, therefore, it is possible to obtain a highly reliable ranking result.
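The second embodiment can be sketched in the same way. The function name `transition_matrix_example2` is hypothetical, and the treatment of a node with no links at all (its column is left at zero to avoid dividing by k(j)=0) is an assumption of this sketch, since the text does not address that case.

```python
def transition_matrix_example2(A, m=1.0):
    N = len(A)
    k_out = [sum(A[i][j] for i in range(N)) for j in range(N)]
    # Expression (31): columns of nodes without an out-link are left at zero.
    At = [[m / k_out[j] * A[i][j] if k_out[j] != 0 else 0.0 for j in range(N)]
          for i in range(N)]
    # Expressions (33) and (34): bidirectionalize with a zero diagonal.
    Ab = [[0.0 if i == j else At[i][j] + At[j][i] for j in range(N)]
          for i in range(N)]
    # Expression (35): link total of each node.
    k = [sum(Ab[i][j] for i in range(N)) for j in range(N)]
    # Assumption: a fully isolated node (k[j] == 0) keeps an all-zero column.
    return [[Ab[i][j] / k[j] if k[j] != 0 else 0.0 for j in range(N)]
            for i in range(N)]


A = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]  # FIG. 2A with a=0, b=1, c=2
T1 = transition_matrix_example2(A, m=1.0)
T4 = transition_matrix_example2(A, m=4.0)
```

For FIG. 2A, the entry for the link from b to a is 1/((1+κ(b))·k_out(b)) = 1/(1.5·1) = 2/3, as given by Expression (38), whichever m is used.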

Example 3

Next, a third embodiment of the invention will be described. The third embodiment is similar to the first embodiment in that nodes having an out-link and those having no out-link can be handled in a unified manner, and the dependency on the set value m of the total sum of out-links per node is eliminated, but different therefrom in that a virtual node is not added.

In the information analysis apparatus 10 of the third embodiment, directed links are bidirectionalized without adding a virtual node, and then the link weight calculating portion 26 calculates the weights of the links in the following manner. First, the link weight calculating portion 26 corrects the adjacency matrix A_(ij) on the basis of following Expressions (40) and (41). With ε>0 set, when i and j are not equal to each other, the following is obtained:

$\begin{matrix}{{\overset{\sim}{A}}_{ij} = {\frac{m}{{k_{out}(j)} + ɛ}\left( {A_{ij} + \frac{ɛ}{N - 1}} \right)}} & (40)\end{matrix}$

When i and j are equal to each other, the following is obtained:

Ã_(ii)=0  (41)

With respect to the corrected adjacency matrix, following Expressions (42) and (43) hold:

$\begin{matrix}{{\sum\limits_{{i = 1},{i \neq j}}^{N}{\overset{\sim}{A}}_{ij}} = m} & (42) \\{{\sum\limits_{{i = 1},{i \neq j}}^{N}{\overset{\sim}{A}}_{ji}} = {m\; {\kappa (j)}}} & (43)\end{matrix}$

where

$\begin{matrix}{{\kappa (j)} = {\sum\limits_{{i = 1},{i \neq j}}^{N}\frac{A_{ji} + \frac{ɛ}{N - 1}}{{k_{out}(i)} + ɛ}}} & (44)\end{matrix}$

Next, the corrected adjacency matrix is bidirectionalized. When i and jare not equal to each other,

Ā _(ij) =Ã _(ij) +Ã _(ji)  (45)

and, when i and j are equal to each other,

Ā_(ii)=0  (46)

Of course, the bidirectionalized adjacency matrix is a symmetric matrix, and hence the following is attained:

Ā_(ij)=Ā_(ji)  (47)

Then, k(j) is expressed by following Expression (48):

$\begin{matrix}\begin{matrix}{{k(j)} = {\sum\limits_{{i = 1},{i \neq j}}^{N}{\overset{\_}{A}}_{ij}}} \\{= {{\sum\limits_{{i = 1},{i \neq j}}^{N}{\overset{\sim}{A}}_{ij}} + {\sum\limits_{{i = 1},{i \neq j}}^{N}{\overset{\sim}{A}}_{ji}}}} \\{= {m + {m\; {\kappa (j)}}}} \\{= {m\left( {1 + {\kappa (j)}} \right)}}\end{matrix} & (48)\end{matrix}$

Here, the bidirectionalized adjacency matrix

Ā_(ij)

is normalized by k(j), and a transition probability matrix T_(ij) which is expressed by following Expression (49) is obtained:

$\begin{matrix}\left\{ \begin{matrix}{T_{ij} = {\frac{{\overset{\_}{A}}_{ij}}{k(j)}\mspace{20mu} \left( {i \neq j} \right)\;}} \\{T_{ii} = 0}\end{matrix} \right. & (49)\end{matrix}$

In the above, T_(ij) is usually an asymmetric matrix. For T_(ij), following Expression (50) holds:

$\begin{matrix}\begin{matrix}{T_{ij} = {\frac{1}{k(j)}{\overset{\_}{A}}_{ij}}} \\{= {\frac{1}{k(j)}\left( {{\overset{\sim}{A}}_{ij} + {\overset{\sim}{A}}_{ji}} \right)}} \\{= {\frac{1}{m\left( {1 + {\kappa (j)}} \right)}\begin{pmatrix}{{\frac{m}{{k_{out}(j)} + \varepsilon}\left( {A_{ij} + \frac{\varepsilon}{N - 1}} \right)} +} \\{\frac{m}{{k_{out}(i)} + \varepsilon}\left( {A_{ji} + \frac{\varepsilon}{N - 1}} \right)}\end{pmatrix}}} \\{= {\frac{1}{\left( {1 + {\kappa (j)}} \right)}\begin{pmatrix}{{\frac{1}{{k_{out}(j)} + \varepsilon}\left( {A_{ij} + \frac{\varepsilon}{N - 1}} \right)} +} \\{\frac{1}{{k_{out}(i)} + \varepsilon}\left( {A_{ji} + \frac{\varepsilon}{N - 1}} \right)}\end{pmatrix}}}\end{matrix} & (50)\end{matrix}$

As is apparent also from Expression (50) above, the transition probability matrix T_(ij) does not depend on m. According to the embodiment, nodes having an out-link and those having no out-link can be handled in a unified manner.
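A corresponding sketch for the third embodiment follows. The function name `transition_matrix_example3` and the value eps=0.1 are assumptions made for illustration; the network must contain at least two nodes so that N−1 is nonzero.

```python
def transition_matrix_example3(A, m=1.0, eps=0.1):
    N = len(A)  # requires N >= 2
    k_out = [sum(A[i][j] for i in range(N)) for j in range(N)]
    # Expressions (40) and (41): every off-diagonal entry receives a small weight.
    At = [[0.0 if i == j else m / (k_out[j] + eps) * (A[i][j] + eps / (N - 1))
           for j in range(N)] for i in range(N)]
    # Expressions (45) and (46): bidirectionalize with a zero diagonal.
    Ab = [[0.0 if i == j else At[i][j] + At[j][i] for j in range(N)]
          for i in range(N)]
    # Expression (48): link total of each node, always positive because eps > 0.
    k = [sum(Ab[i][j] for i in range(N)) for j in range(N)]
    # Expression (49): normalize each column.
    return [[Ab[i][j] / k[j] for j in range(N)] for i in range(N)]


A = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]  # FIG. 2A with a=0, b=1, c=2
T1 = transition_matrix_example3(A, m=1.0)
T5 = transition_matrix_example3(A, m=5.0)
```

Because ε is added to every column, no case distinction for nodes without an out-link is needed, unlike in the second embodiment.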

The invention is not restricted to the above-described embodiments, and may of course be variously changed, modified, or replaced by those skilled in the art.

The foregoing description of the embodiments of the present invention has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations will be apparent to practitioners skilled in the art. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, thereby enabling others skilled in the art to understand the invention for various embodiments and with the various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the following claims and their equivalents.

1. An information analysis apparatus comprising: a storage that stores data values while respectively correlating the data values with a plurality of nodes; a first setting unit that, for the nodes, sets a first virtual link that is directed oppositely to a predetermined directed link; a second setting unit that adds a virtual node to the nodes, and that sets a second virtual link which is bidirectional between the added virtual node and each of the nodes; and an updating unit that updates data values respectively correlated with the nodes, on the basis of respective weights of predetermined links between the nodes, the first virtual link, and the second virtual link.
2. The information analysis apparatus as claimed in claim 1, further comprising: a third setting unit that sets each of the weights of the predetermined links between the nodes, the first virtual link, and the second virtual link, on the basis of a probability of transition from one node connected to the link to another node.
3. An information analysis apparatus comprising: a storage that stores data values while respectively correlating the data values with a plurality of nodes; a first setting unit that, for the nodes, sets a virtual link that is directed oppositely to a predetermined directed link; a second setting unit that sets each of weights of the predetermined links between the nodes and the virtual link, on the basis of a probability of transition from one node connected to the link to another node; and an updating unit that updates data values respectively correlated with the nodes, on the basis of respective weights of predetermined links between the nodes and the virtual link.
4. A computer readable medium storing a program causing a computer to execute a process for performing an analysis in a network configured by a plurality of nodes where a directed link is set, the process comprising: storing data values while respectively correlating the data values with the nodes; setting a first virtual link that is directed oppositely to a predetermined directed link, for the nodes; adding a virtual node to the nodes, and setting a second virtual link which is bidirectional between the virtual node and each of the nodes; and updating data values respectively correlated with the nodes, on the basis of respective weights of predetermined links between the nodes, the first virtual link, and the second virtual link.
5. A computer readable medium storing a program causing a computer to execute a process for performing an analysis in a network configured by a plurality of nodes where a directed link is set, the process comprising: storing data values while respectively correlating the data values with the nodes; setting a virtual link that is directed oppositely to a predetermined directed link, for the nodes; setting each of weights of the predetermined links between the nodes and the virtual link, on the basis of a probability of transition from one node connected to the link to another node; and updating data values respectively correlated with the nodes, on the basis of respective weights of predetermined links between the nodes and the virtual link.